Classical Computational Multibody Dynamics:
Newton-Euler Constrained Equations of Motion

$$\dot{q} = L(q)\,v$$

$$M(q)\,\dot{v} = f(t, q, v) - g_q^T(q, t)\,\lambda$$

$$g(q, t) = 0$$

• $q$: generalized positions
• $v$: generalized velocities
• $L(q)$: velocity transformation matrix
• $M(q)$: generalized mass matrix
• $f(t, q, v)$: applied force
• $-g_q^T(q, t)\,\lambda$: reaction force
Multibody Dynamics: What is it? [commercial software simulation]
Example Open Problem: Mobility on Deformable Terrain

• How is the rover moving along on a slope with granular material?
• What wheel geometry is more effective?
• How much power is needed to move it?
• At what grade will it get stuck?
• And so on...
Frictional Contact Simulation
[Commercial Software Simulation - 2007]

• Model Parameters:
    • Spheres: 60 mm diameter and mass 0.882 kg
    • Penalty Approach: stiffness of 1E5, force exponent of 2.2, damping coefficient of 10.0
    • Simulation length: 3 seconds
Frictional Contact Simulation
[Commercial Software Simulation - 2013]

• Same problem tested in 2013
• Simulation time reduced by a factor of six
• Simulation times still prohibitively long

[Figure: CPU time vs. number of spheres (0 to 70), with a quadratic trend line whose coefficients read approximately 0.186, -2.595, and 6.929; R² = 0.9998]
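The trend line fitted to the CPU-time plot gives a quick way to see why the simulation times are called prohibitively long. The sketch below is only illustrative: it assumes the quadratic coefficients as they can be read off the plot (0.186, -2.595, 6.929), that the independent variable is the number of spheres, and that the time unit is the one used on the plot's vertical axis. The function name is hypothetical.

```python
def cpu_time_fit(n_spheres):
    """Quadratic trend line; coefficients as read from the plot (assumed)."""
    return 0.186 * n_spheres ** 2 - 2.595 * n_spheres + 6.929


# Doubling the number of spheres roughly quadruples the predicted cost.
growth_ratio = cpu_time_fit(2000) / cpu_time_fit(1000)
```

Extrapolating a fit this far beyond the measured range (at most 70 spheres) is of course speculative; the point is only the quadratic growth.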
Why It's Worth Reconsidering Challenging Problems...
Lab's Research Heterogeneous Cluster

[Diagram: file server, head node, CPU/GPU nodes, and AMD nodes connected through a Gigabit Ethernet switch and a 4x QDR Infiniband switch; internal users and remote collaborators are also shown. Legend, connection type: Gigabit Ethernet, 4x QDR Infiniband]
Lab's Research Heterogeneous Cluster

• More than 50,000 GPU scalar processors
• More than 1,200 CPU cores
• Fast Mellanox Infiniband Interconnect (QDR), 40 Gb/sec
• About 2.7 TB of RAM
• More than 20 Tflops Double Precision

The issue is not hardware availability. Rather, it is producing modeling and solution techniques that can leverage the hardware.
CHRONO:

Research-Grade  Software  Infrastructure  for  Multi-physics  Modeling/Simulation/Visualization 


•  Goal:  advance  state  of  the  art  in  modeling,  simulation,  and  visualization 

•  Use  emerging  hardware  and  novel  algorithms  to  solve  open  engineering  problems 


•  “emerging  hardware” : 

•  GPUs  and  clusters  of  CPUs 


•  “open  engineering  problems” : 

•  Fluid-solid  interaction,  vehicle  mobility,  soil  modeling,  tire/terrain  modeling,  granular  dynamics,  etc. 
Chrono:

Five Foundation Components
• Advanced modeling

•  Solution  methods 

•  Proximity  computation 

•  Domain  decomposition  &  Inter-domain  data 
exchange 

•  Pre/Post-processing  (visualization) 


•  Chrono: 

•  Five  foundation  components  support  vertical  apps 
Advanced Modeling Techniques


Advanced  modeling  techniques 
Algorithmic  (applied  math)  support 
Proximity  computation 

Domain  decomposition  &  Inter-domain  data  exchange 
Post-processing  (visualization) 
Support for Advanced Modeling Techniques
• Modeling: what does it mean?

•  The  process  of  formulating  a  set  of  governing  differential  equations  that  captures  the  physics  associated  with  the 
engineering  problem  of  interest 


•  Modeling  decisions  are  consequential 

•  Hallmark  of  good  modeling:  it  leads  to  a  palatable  math  problem  that  can  be  solved  numerically  with  relative  ease 
Chrono::Flex - Dealing with Compliant Bodies
Deformable Body Modeling Support in Chrono

Equation of Motion & Mass Matrix

• Equations of Motion:

$$M\ddot{e} + Q_s = Q_e$$

• Mass matrix is constant and SPD:

$$M = \int_V \rho\, S^T S \, dV$$

System Forces

• Due to gravity:

$$Q_e = A \int_0^l S^T f_g \, dx$$

• Due to a concentrated force:

$$Q_e = S^T f$$
Deformable Bodies: Internal Forces...

• Strain Energy (shown for beam elements):

$$U = \frac{1}{2}\int_0^l EA\,(\varepsilon_{11})^2\,dx + \frac{1}{2}\int_0^l EI\,(\kappa)^2\,dx$$

• Partial derivative of the strain energy with respect to the generalized coordinates $e$ yields the internal forces:

$$Q_s = \int_0^l EA\,\varepsilon_{11}\left(\frac{\partial \varepsilon_{11}}{\partial e}\right)^T dx + \int_0^l EI\,\kappa\left(\frac{\partial \kappa}{\partial e}\right)^T dx$$

• Bad news: internal force expensive to evaluate
• Good news: for large systems can be done in parallel
Deformable Bodies with Constraints

• Constraints assumed holonomic, formulated as

$$\Phi(q, t) = [\Phi_1(q, t) \;\dots\; \Phi_m(q, t)]^T = 0$$

• Constraint form of the Equations of Motion (index 3 DAE problem)

$$M\ddot{q} + \Phi_q^T(q, t)\,\lambda + Q_{int}(q) = Q_{ext}(\dot{q}, q, t)$$
Wiggly Bodies

[Flexible  bodies,  w/  Friction  and  Contact:  parallel  simulation  on  the  GPU] 
Deformable Bodies with Friction and Contact

• Contact forces depend on several parameters:
    • Contact penetration
    • Normal vector
    • Relative velocity of colliding bodies at the point of contact
    • Etc.

• Normal force due to a collision calculated as

$$F_n = K\,\delta^n, \qquad K = \frac{4}{3(\sigma_i + \sigma_j)}\left[\frac{R_i R_j}{R_i + R_j}\right]^{0.5}$$

• Damping and friction can also be introduced
• Relies on a spherical decomposition of the geometry
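The normal-force law on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the Chrono implementation: it assumes the Hertz-type stiffness $K = \frac{4}{3(\sigma_i + \sigma_j)}\sqrt{R_i R_j / (R_i + R_j)}$, an exponent $n = 1.5$, and the material parameter $\sigma = (1 - \nu^2)/E$. The slide states neither $n$ nor the definition of $\sigma$, and all function names are hypothetical.

```python
import math


def material_parameter(young_modulus, poisson_ratio):
    """Assumed definition: sigma = (1 - nu^2) / E."""
    return (1.0 - poisson_ratio ** 2) / young_modulus


def hertz_stiffness(sigma_i, sigma_j, r_i, r_j):
    """K = 4 / (3 (sigma_i + sigma_j)) * sqrt(R_i R_j / (R_i + R_j))."""
    return 4.0 / (3.0 * (sigma_i + sigma_j)) * math.sqrt(r_i * r_j / (r_i + r_j))


def normal_force(penetration, stiffness, exponent=1.5):
    """F_n = K * delta^n; zero when the two spheres do not overlap."""
    if penetration <= 0.0:
        return 0.0
    return stiffness * penetration ** exponent
```

Damping and friction terms, which the slide says can also be introduced, are left out of this sketch.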
Ball - Deformable Net Interaction
Chrono::Rigid - Mixing 50,000 M&Ms on the GPU
Many-Body Dynamics with Friction and Contact
h = 10 cm, ρ_b = 2.2 g/cm³
Chrono::Flow - Particle in Suspension Flow
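Coupled Problem: Fluid-Solid Interaction

$$\dot{q} = L(q)\,v$$

$$M(q)\,\dot{v} = f(t, q, v) - g_q^T(q, t)\,\lambda + \sum_{i=1}^{N_c}\left(\gamma_{i,n} D_{i,n} + \gamma_{i,u} D_{i,u} + \gamma_{i,w} D_{i,w}\right)$$

$$g(q, t) = 0$$

• $q$: generalized positions; $v$: generalized velocities
• $L(q)$: velocity transformation matrix; $M(q)$: generalized mass matrix
• $f(t, q, v)$: applied force; $-g_q^T(q, t)\,\lambda$: reaction force
• The sum over the $N_c$ contacts: frictional contact force

Conservation of mass:

$$\frac{d\rho}{dt} = -\rho\,\frac{\partial v^\alpha}{\partial x^\alpha}$$

Conservation of momentum:

$$\frac{dv^\alpha}{dt} = \frac{1}{\rho}\frac{\partial \sigma^{\alpha\beta}}{\partial x^\beta} + \frac{f^\alpha}{\rho}$$

Conservation of energy:

$$\frac{du}{dt} = \frac{\sigma^{\alpha\beta}}{\rho}\frac{\partial v^\alpha}{\partial x^\beta}$$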
Algorithmic (applied math) support
Deformable Bodies: Implicit Integration using Newmark...

Newmark Integration Formula

• New positions and velocities obtained at $t_{n+1}$ based on new accelerations & Lagrange multipliers

$$q_{n+1} = q_n + h\,\dot{q}_n + \frac{h^2}{2}\left[(1 - 2\beta)\,\ddot{q}_n + 2\beta\,\ddot{q}_{n+1}\right]$$

$$\dot{q}_{n+1} = \dot{q}_n + h\left[(1 - \gamma)\,\ddot{q}_n + \gamma\,\ddot{q}_{n+1}\right]$$

Index 3 DAE Problem

• Discretized Equations of Motion at $t_{n+1}$:

$$(M\ddot{q})_{n+1} + (\Phi_q^T \lambda)_{n+1} + (Q_{int} - Q_{ext})_{n+1} = 0$$

• Kinematic constraints evaluated at new time step $t_{n+1}$:

$$\Phi(q_{n+1}, t_{n+1}) = 0$$
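The two Newmark formulas translate directly into code. The sketch below is a minimal illustration for scalar or list-valued states, assuming the common parameter choice $\beta = 0.25$, $\gamma = 0.5$ (the slide does not give values) and assuming the new accelerations are already known; in the actual scheme they come out of the nonlinear solve. The function name is hypothetical.

```python
def newmark_update(q, qd, qdd, qdd_new, h, beta=0.25, gamma=0.5):
    """Return (q_new, qd_new) from the Newmark formulas.

    q, qd, qdd are positions, velocities, accelerations at t_n;
    qdd_new are the accelerations at t_{n+1}. All are lists of floats.
    """
    q_new = [
        qi + h * vi + 0.5 * h * h * ((1.0 - 2.0 * beta) * ai + 2.0 * beta * an)
        for qi, vi, ai, an in zip(q, qd, qdd, qdd_new)
    ]
    qd_new = [
        vi + h * ((1.0 - gamma) * ai + gamma * an)
        for vi, ai, an in zip(qd, qdd, qdd_new)
    ]
    return q_new, qd_new


# Constant acceleration of 2 from rest over h = 1: exact answer is q = 1, v = 2.
q1, v1 = newmark_update([0.0], [0.0], [2.0], [2.0], 1.0)
```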
Deformable Bodies: Implicit Integration using Newmark...

Solving an index 3 set of Differential Algebraic Equations (DAEs) w/ implicit integration

• Relies on Newton-Krylov approach to solve nonlinear problem at each time step
• Updates in the accelerations at iteration (k) computed as

$$\begin{bmatrix} \hat{M} & \Phi_q^T \\ \Phi_q & 0 \end{bmatrix}^{(k)} \begin{bmatrix} \Delta\ddot{q} \\ \Delta\lambda \end{bmatrix}^{(k)} = \begin{bmatrix} -e_1 \\ -e_2 \end{bmatrix}^{(k)}$$

• Residuals capture error in satisfying the equations of motion and the kinematic constraint equations:

$$e_1 = (M\ddot{q})_{n+1} + (\Phi_q^T\lambda)_{n+1} + (Q_{int})_{n+1} - (Q_{ext})_{n+1}$$

$$e_2 = \frac{1}{\beta h^2}\,\Phi(q_{n+1}, t_{n+1})$$
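Jacobian Matrix Computation, Flex Body Dynamics

• Sensitivity computation costly

$$\hat{M} = \frac{\partial e_1}{\partial \ddot{q}} = M - h\gamma\,\frac{\partial Q_{ext}}{\partial \dot{q}} + \beta h^2\left(\frac{\partial Q_{int}}{\partial q} - \frac{\partial Q_{ext}}{\partial q}\right)$$

• Computational bottleneck is evaluation of sensitivity of internal forces:

$$\frac{\partial Q_{int}}{\partial q} = \int_0^l EA\,\varepsilon_{11}\,\frac{\partial}{\partial e}\left(\frac{\partial \varepsilon_{11}}{\partial e}\right)^T dx + \int_0^l EA\left(\frac{\partial \varepsilon_{11}}{\partial e}\right)^T\left(\frac{\partial \varepsilon_{11}}{\partial e}\right) dx + \int_0^l EI\,\kappa\,\frac{\partial}{\partial e}\left(\frac{\partial \kappa}{\partial e}\right)^T dx + \int_0^l EI\left(\frac{\partial \kappa}{\partial e}\right)^T\left(\frac{\partial \kappa}{\partial e}\right) dx$$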
0.25 km Net Simulation: 101,025 Beams & 640,146 Constraints
200,000 Bodies & 10 kg Anchor
[solved in parallel on the GPU]

8/26/2013
UNCLASSIFIED: Distribution Statement A. Approved for public release. #24389
Wisconsin Applied Computing Center

Cut-away, Anchoring Simulation
0.5 Million Rigid Bodies
• Granular material has unique properties


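Rigid Body Dynamics w/ DVI: From Continuum to Discrete

• Seeking to solve numerically the equations of motion robustly & effectively
• Discussion focused here on many-body dynamics

$$\dot{q} = L(q)\,v$$

$$M(q)\,\dot{v} = f(t, q, v) - g_q^T(q, t)\,\lambda + \sum_{i=1}^{N_c}\left(\gamma_{i,n} D_{i,n} + \gamma_{i,u} D_{i,u} + \gamma_{i,w} D_{i,w}\right)$$

$$g(q, t) = 0$$

For each contact $i = 1, 2, \dots, N_c$:

$$0 \le \Phi_i(q, t) \;\perp\; \gamma_{i,n} \ge 0$$

$$(\gamma_{i,u}, \gamma_{i,w}) = \underset{\sqrt{\gamma_{i,u}^2 + \gamma_{i,w}^2}\,\le\,\mu_i \gamma_{i,n}}{\operatorname{argmin}}\; v^T\left(\gamma_{i,u} D_{i,u} + \gamma_{i,w} D_{i,w}\right)$$

• $q$: generalized positions; $v$: generalized velocities
• $L(q)$: velocity transformation matrix; $M(q)$: generalized mass matrix
• $f(t, q, v)$: applied force; $-g_q^T(q, t)\,\lambda$: reaction force
• The sum over contacts: frictional contact force
• $\gamma_{i,n}$: contact impulse, for contact $i$
• $\gamma_{i,u}$, $\gamma_{i,w}$: friction impulse components, for contact $i$
• $\Phi_i$: gap function, for contact $i$
• $N_c$: total number of contacts
• The quantity minimized in the argmin: friction dissipation energy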
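The Discretized Problem: from $t^{(l)}$ to $t^{(l+1)}$

$$q^{(l+1)} = q^{(l)} + h\,L(q^{(l)})\,v^{(l+1)}$$

$$M\left(v^{(l+1)} - v^{(l)}\right) = h\,f\left(t^{(l)}, q^{(l)}, v^{(l)}\right) + \sum_{i \in \mathcal{A}(q^{(l)}, \delta)}\left(\gamma_{i,n} D_{i,n} + \gamma_{i,u} D_{i,u} + \gamma_{i,w} D_{i,w}\right)$$

$$0 \le \frac{1}{h}\Phi_i(q^{(l)}) + D_{i,n}^T v^{(l+1)} \;\perp\; \gamma_{i,n} \ge 0$$

$$(\gamma_{i,u}, \gamma_{i,w}) = \underset{\mu_i \gamma_{i,n}\,\ge\,\sqrt{\gamma_{i,u}^2 + \gamma_{i,w}^2}}{\operatorname{argmin}}\; v^{(l+1),T}\left(\gamma_{i,u} D_{i,u} + \gamma_{i,w} D_{i,w}\right)$$

• $q$: positions; $v$: speeds; superscript $(l)$: time step index
• $M$: mass matrix; $h\,f$: applied forces; the sum over active contacts: reaction impulses
• Third relation: complementarity condition
• Fourth relation: Coulomb 3D friction model

(D. Stewart, 1998)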
The Cone Complementarity Problem (CCP)

• Define the convex hypercone...

$$\Upsilon = \bigoplus_{i \in \mathcal{A}(q^{(l)}, \epsilon)} \mathcal{FC}^i$$

$\mathcal{FC}^i \subset \mathbb{R}^3$ represents the friction cone associated with the $i$-th contact

• ... and its polar hypercone:

$$\Upsilon^\circ = \bigoplus_{i \in \mathcal{A}(q^{(l)}, \epsilon)} \mathcal{FC}^{i\circ}$$

• The problem can be formulated as: find $\gamma$ that solves the following CCP

$$\gamma \in \Upsilon \;\perp\; -(N\gamma + d) \in \Upsilon^\circ$$
The Quadratic Programming Angle...

• The CCP captures the first-order optimality condition for a quadratic optimization problem with conic constraints:

$$\min\; q(\gamma) = \frac{1}{2}\gamma^T N \gamma + d^T\gamma \quad \text{subject to } \gamma_i \in \Upsilon_i \text{ for } i = 1, 2, \dots, N_c$$

Notation used:

$$\gamma = [\gamma_1^T, \gamma_2^T, \dots, \gamma_{N_c}^T]^T \in \mathbb{R}^{3N_c}$$

and

$$\Upsilon_i:\; \sqrt{\gamma_{i,u}^2 + \gamma_{i,w}^2} - \mu_i \gamma_{i,n} \le 0$$
Test Problem: 1000 rigid spheres with 3525 contacts

• Cost function depends on 3,525 variables (or about 10,500 if friction is present)

Objective Function Value
[1K bodies, 3525 contacts]
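Jacobi

$$\hat{\gamma}^{r+1} = \gamma^r - \omega B\left[N\gamma^r + r\right]$$

$$\bar{\gamma}^{r+1} = P\left(\hat{\gamma}^{r+1}\right)$$

$$\gamma^{r+1} = \lambda\,\bar{\gamma}^{r+1} + (1 - \lambda)\,\gamma^r$$

$$B = \begin{bmatrix} \eta_1 & & 0 \\ & \ddots & \\ 0 & & \eta_{n_c} \end{bmatrix}, \qquad \eta_i = \frac{1}{\operatorname{Trace}(D_i^T M^{-1} D_i)}$$

(As a superscript, $r$ is the iteration index; inside the brackets, $r$ is a vector. $P$ denotes the projection.)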
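To make the projected Jacobi iteration concrete, here is a small pure-Python sketch. It is an illustration under assumptions, not the solver used in Chrono: each contact's unknowns are ordered as (normal, tangent u, tangent w), the projection is the Euclidean projection onto the Coulomb cone $\sqrt{\gamma_u^2 + \gamma_w^2} \le \mu \gamma_n$, $\eta_i$ is taken as one over the trace of the $i$-th 3x3 diagonal block of $N$, and the defaults $\omega = 1$, $\lambda = 1$ are arbitrary. All names are hypothetical.

```python
import math


def project_cone(g, mu):
    """Euclidean projection of g = (g_n, g_u, g_w) onto the friction cone."""
    gn, gu, gw = g
    t = math.hypot(gu, gw)
    if gn >= 0.0 and t <= mu * gn:   # already inside the cone
        return [gn, gu, gw]
    if mu * t <= -gn:                # inside the polar cone: project to the apex
        return [0.0, 0.0, 0.0]
    gn_p = (gn + mu * t) / (1.0 + mu * mu)
    s = mu * gn_p / t                # rescale the tangential part onto the cone surface
    return [gn_p, s * gu, s * gw]


def projected_jacobi(N, r, mu, omega=1.0, lam=1.0, iters=200):
    """Iterate gamma <- lam * P(gamma - omega * B (N gamma + r)) + (1 - lam) * gamma."""
    n = len(r)
    gamma = [0.0] * n
    eta = [1.0 / sum(N[3 * i + k][3 * i + k] for k in range(3)) for i in range(len(mu))]
    for _ in range(iters):
        grad = [sum(N[a][b] * gamma[b] for b in range(n)) + r[a] for a in range(n)]
        trial = [gamma[a] - omega * eta[a // 3] * grad[a] for a in range(n)]
        projected = []
        for i in range(len(mu)):
            projected.extend(project_cone(trial[3 * i:3 * i + 3], mu[i]))
        gamma = [lam * projected[a] + (1.0 - lam) * gamma[a] for a in range(n)]
    return gamma


# One contact, N = identity: the minimizer is the projection of -r onto the cone.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
solution = projected_jacobi(identity, [-1.0, -2.0, 0.0], [0.5])
```

With $N$ equal to the identity the answer can be checked by hand: projecting (1, 2, 0) onto the cone with $\mu = 0.5$ gives (1.6, 0.8, 0).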
GP-MINRES

Algorithm GPMINRES($N$, $r$, $\tau$, $\eta_1$, $\eta_2$, $N_{max}$, $M_{max}$)
(1)  $\gamma^{(0)} := 0_{n_c}$
(2)  for $k := 0$ to $N_{max}$
(3)      $y^{(0)} = \gamma^{(k)}$
(4)      while aggressively changing active set and reducing cost function
(5)          $y^{(j+1)} = P[y^{(j)} - \alpha_j \nabla q(y^{(j)})]$
(6)          $j = j + 1$
(7)      endwhile
(8)      $\gamma^{(k)} = y^{(j)}$
(9)      Determine active set $\mathcal{A}(\gamma^{(k)})$ and $Z_k$ and ...
(10)     $w_0 = 0_{m_k}$
(11)     for $j := 0$ to $M_{max}$
(12)         MINRES step: $w^{(j)} \rightarrow w^{(j+1)}$
(13)         $j = j + 1$
(14)         if sluggish convergence
(15)             break
(16)     endfor
(17)     Set $w_k :=$ ...
(18)     Get $\gamma^{(k+1)}$ via backtracking line-search with direction $d_k = Z_k w_k$
(19)     if $\|\nabla_\Omega q(\gamma^{(k+1)})\|_\infty < \tau$
(20)         break
(21) endfor
(22) return Value at time step $t_{l+1}$, $\gamma^{l+1} := \gamma^{(k+1)}$.
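P-SPG-FB

Algorithm P-SPG-FB($N$, $r$, $x_0$, $\mathcal{K}$, $P$ $\mapsto$ $x$)
(1)  $x_0 := \Pi_{\mathcal{K}}(x_0)$, $x_{FB} = x_0$, $\alpha_0 \in [\alpha_{min}, \alpha_{max}]$
(2)  $g_0 := N x_0 + r$, $f(x_0) = \frac{1}{2} x_0^T N x_0 + x_0^T r$, $w_0 = 102$
(3)  for $j := 0$ to $N_{max}$
(4)      $p_j = P^{-1} g_j$
(5)      $d_j = \Pi_{\mathcal{K}}(x_j - \alpha_j p_j) - x_j$
(6)      if $\langle d_j, g_j \rangle > 0$
(7)          $d_j = \Pi_{\mathcal{K}}(x_j - \alpha_j g_j) - x_j$
(8)      $\lambda := 1$
(9)      while line search
(10)         $x_{j+1} := x_j + \lambda d_j$
(11)         $g_{j+1} := N x_{j+1} + r$
(12)         $f(x_{j+1}) = \frac{1}{2} x_{j+1}^T N x_{j+1} + x_{j+1}^T r$
(13)         if $f(x_{j+1}) > \max_{i = 0, \dots, \min(j, N_{GLL})} f(x_{j-i}) + \gamma \lambda \langle d_j, g_j \rangle$
(14)             define $\lambda_{new} \in [\sigma_{min}\lambda, \sigma_{max}\lambda]$ and repeat line search
(15)         else
(16)             terminate line search
(17)     $s_j = x_{j+1} - x_j$
(18)     $y_j = g_{j+1} - g_j$
(19)     if $j$ is odd
(20)         $\alpha_{j+1} = \langle s_j, P s_j \rangle / \langle s_j, y_j \rangle$
(21)     else
(22)         $\alpha_{j+1} = \langle s_j, y_j \rangle / \langle y_j, P^{-1} y_j \rangle$
(23)     $\alpha_{j+1} = \min(\alpha_{max}, \max(\alpha_{min}, \alpha_{j+1}))$
(24)     $w_{j+1} = \|[x_{j+1} - \Pi_{\mathcal{K}}(x_{j+1} - \tau_g g_{j+1})] / \tau_g\|_2 = \|\epsilon\|_2$
(25)     if $w_{j+1} < \min_{k = 0, \dots, j} w_k$
(26)         $x_{FB} = x_{j+1}$
(27) return $x_{FB}$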
Kucera Algorithm

Algorithm KUCERA($N$, $r$, $x^0$, $\mathcal{K}$, $\Gamma > 0$, $\bar{\alpha} \in (0, \|N\|^{-1}]$, $\epsilon > 0$)
(1)  $k = 0$
(2)  $g = N x^0 + r$
(3)  $p = \phi(x^0)$
(4)  while $\|g(x^k)\| > \epsilon$
(5)      if $\beta(x^k)^T g(x^k) \le \Gamma^2 \phi(x^k)^T g(x^k)$
(6)          $\alpha_{cg} = g^T p / p^T N p$
(7)          $\alpha_f = \min_i(\alpha_{f,i})$, where $\alpha_{f,i} = x_i^k / p_i$ if $p_i > 0$, and $\alpha_{f,i} = \infty$ if $p_i \le 0$
(8)          if $\alpha_{cg} \le \alpha_f$
(9)              $x^{k+1} = x^k - \alpha_{cg}\,p$
(10)             $g = g - \alpha_{cg}\,N p$
(11)             $\gamma = \phi(x^{k+1})^T N p / p^T N p$
(12)             $p = \phi(x^{k+1}) - \gamma\,p$
(13)         else
(14)             $x^{k+1/2} = x^k - \alpha_f\,p$
(15)             $x^{k+1} = x^{k+1/2} - \bar{\alpha}\,\phi(x^{k+1/2})$
(16)             $g = N x^{k+1} + r$
(17)             $p = \phi(x^{k+1})$
(18)     else
(19)         $x^{k+1} = x^k - \bar{\alpha}\,\beta(x^k)$
(20)         $g = N x^{k+1} + r$
(21)         $p = \phi(x^{k+1})$
(22)     $k = k + 1$
(23) return $x^k$
Nesterov's Accelerated Projected Gradient Descent

ALGORITHM NAPG($N$, $r$, $t \le 1/\lambda_{max}(N)$, $\tau$, $N_{max}$)
(1)  $\gamma_0 = 0_{n_c}$
(2)  $\hat{\gamma}_0 = 1_{n_c}$
(3)  $y_0 = \gamma_0$
(4)  $\theta_0 = 1$
(5)  for $k := 0$ to $N_{max}$
(6)      $g = N y_k + r$
(7)      $\gamma_{k+1} = \Pi_{\mathcal{K}}(y_k - t\,g)$
(8)      $\theta_{k+1} = \dfrac{-\theta_k^2 + \theta_k\sqrt{\theta_k^2 + 4}}{2}$
(9)      $\beta_{k+1} = \theta_k \dfrac{1 - \theta_k}{\theta_k^2 + \theta_{k+1}}$
(10)     $y_{k+1} = \gamma_{k+1} + \beta_{k+1}(\gamma_{k+1} - \gamma_k)$
(11)     $\epsilon = e(\gamma_{k+1})$
(12)     if $\epsilon < \tau$
(13)         break
(14)     endif
(15) endfor
(16) return Value at time step $t_{l+1}$, $\gamma^{l+1} := \gamma$.
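The accelerated projected gradient loop is short enough to sketch in pure Python. The version below is a hedged illustration, not the authors' code: the projection is passed in as a callable, the residual is taken as the norm of the projected-gradient step (an assumed choice, since the slide only names a residual function), and the demo uses the nonnegative orthant as a stand-in for the friction cones. All names are hypothetical.

```python
import math


def apgd(N, r, project, t, tau=1e-10, n_max=1000):
    """Minimize 0.5 x^T N x + r^T x over a convex set given by `project`."""
    n = len(r)
    gamma = [0.0] * n
    y = list(gamma)
    theta = 1.0
    for _ in range(n_max):
        g = [sum(N[a][b] * y[b] for b in range(n)) + r[a] for a in range(n)]
        gamma_new = project([y[a] - t * g[a] for a in range(n)])
        theta_new = 0.5 * (-theta ** 2 + theta * math.sqrt(theta ** 2 + 4.0))
        beta = theta * (1.0 - theta) / (theta ** 2 + theta_new)
        y = [gamma_new[a] + beta * (gamma_new[a] - gamma[a]) for a in range(n)]
        # Residual: size of one more projected-gradient step from the new iterate.
        g_new = [sum(N[a][b] * gamma_new[b] for b in range(n)) + r[a] for a in range(n)]
        step = project([gamma_new[a] - t * g_new[a] for a in range(n)])
        res = math.sqrt(sum(((gamma_new[a] - step[a]) / t) ** 2 for a in range(n)))
        gamma, theta = gamma_new, theta_new
        if res < tau:
            break
    return gamma


def clamp_nonnegative(x):
    """Projection onto the nonnegative orthant (stand-in for the cone projection)."""
    return [max(0.0, xi) for xi in x]


# Largest eigenvalue of N is 3, so t = 1/3 satisfies t <= 1 / lambda_max(N).
result = apgd([[2.0, 1.0], [1.0, 2.0]], [-1.0, 1.0], clamp_nonnegative, 1.0 / 3.0)
```

For this small problem the constrained minimizer is (0.5, 0), which can be confirmed from the optimality conditions by hand.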
Relative Speedup: Benchmark Problem
[1000 bodies]

[Figure panels: number of iterations to convergence; speedup of APGD vs. Jacobi and SOR]
Proximity computation
Tracked Vehicle Simulation
600,000 Bodies Moving & Colliding

[run on the GPU]
Ellipsoid-Ellipsoid CD

Various Geometries Handled...
[Ellipsoid-Ellipsoid Example]

$$d = P_1 - P_2 = \left(\frac{1}{2\lambda_1} M_1 + \frac{1}{2\lambda_2} M_2\right) c + (b_1 - b_2)$$

$$\frac{\partial d}{\partial \alpha_i},\; \frac{\partial P_1}{\partial \alpha_i},\; \frac{\partial P_2}{\partial \alpha_i},\; \frac{\partial^2 d}{\partial \alpha_i \partial \alpha_j},\; \frac{\partial^2 P_1}{\partial \alpha_i \partial \alpha_j},\; \frac{\partial^2 P_2}{\partial \alpha_i \partial \alpha_j}$$

$$\frac{\partial P}{\partial \alpha_i} = \left(\frac{1}{2\lambda} M - \frac{1}{8\lambda^3} M c\,c^T M\right)\frac{\partial c}{\partial \alpha_i}$$

• $A$: rotation matrix
• $M = A R^2 A^T$
• $R = \operatorname{diag}(r_1, r_2, r_3)$
• $b$: translation of the ellipsoid's center
• $\lambda^2 = \frac{1}{4} c^T M c$

$$\min_{\alpha_1, \alpha_2} \left\|d(\alpha_1, \alpha_2)\right\|^2$$
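The ellipsoid formulas can be checked numerically with a few helper functions. The sketch below assumes $M = A R^2 A^T$, $\lambda^2 = \frac{1}{4} c^T M c$, and a surface point $P = b + \frac{1}{2\lambda} M c$, with $P_1$ evaluated along $c$ and $P_2$ along $-c$ so that $d = P_1 - P_2$ matches the expression on the slide. These conventions and all function names are assumptions made for illustration; the minimization over $\alpha_1, \alpha_2$ is not included.

```python
import math


def shape_matrix(A, radii):
    """M = A R^2 A^T with R = diag(r1, r2, r3); A is a 3x3 rotation matrix."""
    return [
        [sum(A[i][k] * radii[k] ** 2 * A[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def surface_point(M, b, c):
    """P = b + M c / (2 lambda), with lambda^2 = c^T M c / 4."""
    Mc = [sum(M[i][j] * c[j] for j in range(3)) for i in range(3)]
    lam = 0.5 * math.sqrt(sum(c[i] * Mc[i] for i in range(3)))
    return [b[i] + Mc[i] / (2.0 * lam) for i in range(3)]


def distance_vector(M1, b1, M2, b2, c):
    """d = P1 - P2, with P1 taken along c and P2 along -c."""
    p1 = surface_point(M1, b1, c)
    p2 = surface_point(M2, b2, [-ci for ci in c])
    return [p1[i] - p2[i] for i in range(3)]


eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
unit_sphere = shape_matrix(eye, [1.0, 1.0, 1.0])
# Two unit spheres with centers 3 apart: the gap along the x axis is 1.
gap = distance_vector(unit_sphere, [0.0, 0.0, 0.0], unit_sphere, [3.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```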
Collision Detection


•  Broad  phase 

•  Draws  on  an  Axis  Aligned  Bounding  Box  (AABB)  approach 


•  Narrow  phase 

•  Draws  on  Minkowski  Portal  Refinement 
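The broad phase relies on a cheap overlap test between axis-aligned bounding boxes. As a minimal sketch (function names are hypothetical, and spheres are used only because their boxes are trivial to build):

```python
def sphere_aabb(center, radius):
    """Axis-aligned bounding box of a sphere as (min_corner, max_corner)."""
    return [c - radius for c in center], [c + radius for c in center]


def aabb_overlap(a_min, a_max, b_min, b_max):
    """Two boxes overlap only if their extents overlap along every axis."""
    return all(
        a_min[k] <= b_max[k] and b_min[k] <= a_max[k] for k in range(len(a_min))
    )
```

Pairs whose boxes do not overlap can be discarded without running the more expensive narrow-phase test.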
CD: Binning

• Example: 2D collision detection, bins are squares
• Body 4 touches bins A4, A5, B4, B5
• Body 7 touches bins A3, A4, A5, B3, B4, B5, C3, C4, C5
• In proposed algorithm, bodies 4 and 7 will be checked for collision by three threads (associated with bin A4, A5, B4)

[Figure: bodies 1 to 10 placed on a grid of square bins, rows A to E and columns 1 to 5]
Stage 1 (Body Parallel)

• Key observation: it's easy to bin bodies
• Purpose: find the number of bins touched by each body
• Store results in the "T" array of N integers

[Figure: bodies on the bin grid (rows A to E, columns 1 to 5); the T-array shown begins 4, 3, 4, 4, 5, ...]
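Stage 1 can be sketched serially as follows. This is an illustration only: it assumes each body is represented by its axis-aligned bounding box and that bins are squares (or cubes) of side `bin_size` aligned with the origin; on the GPU each body would be handled by its own thread. The function names are hypothetical.

```python
import math


def bins_touched(aabb_min, aabb_max, bin_size):
    """Number of grid bins overlapped by one axis-aligned bounding box."""
    count = 1
    for lo, hi in zip(aabb_min, aabb_max):
        count *= math.floor(hi / bin_size) - math.floor(lo / bin_size) + 1
    return count


def stage1(boxes, bin_size):
    """Build the T array: one entry per body (serial stand-in for one thread per body)."""
    return [bins_touched(lo, hi, bin_size) for lo, hi in boxes]


boxes = [
    ((0.5, 0.5), (1.5, 1.5)),   # straddles a bin corner: 4 bins
    ((0.1, 0.1), (0.9, 0.9)),   # fits inside one bin
    ((0.5, 0.5), (2.5, 2.5)),   # spans 3 bins per axis: 9 bins
]
T = stage1(boxes, 1.0)
```

A box whose upper face lies exactly on a bin boundary is counted as touching the next bin in this sketch; a production code would pick a convention and apply it consistently.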
Stage 2: Parallel Inclusive Scan


•  Run  a  parallel  inclusive  scan  on  the  array  T 
•  The  last  element  is  the  total  number  of  bin  touches,  including  the  last  body 


•  Complexity  of  Stage:  O(N)  -  using  thrust  library 


•  Purpose:  determine  the  number  of  entries  M 
needed  to  store  the  indices  of  all  the  bins 
touched  by  each  body  in  the  problem 
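A serial stand-in for the inclusive scan makes the purpose of this stage clear. The sketch uses `itertools.accumulate` in place of the parallel scan from the thrust library, and the T values are illustrative.

```python
from itertools import accumulate


def inclusive_scan(values):
    """Running totals: out[i] = values[0] + ... + values[i]."""
    return list(accumulate(values))


T = [4, 3, 4, 4, 5]          # bins touched by each body (illustrative values)
S = inclusive_scan(T)        # [4, 7, 11, 15, 20]
M = S[-1]                    # total number of bin touches
# Body i writes its entries into the B array starting at offset S[i] - T[i].
write_offsets = [s - t for s, t in zip(S, T)]
```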
Stage 3: Determine bin-to-body association

• Stage executed in parallel on a per-body basis
• Allocate an array B of M pairs of integers
• The key (first entry of the pair) is the bin index
• The value (second entry of the pair) is the body that touches that bin

B-array:
The Value: 1  1  1  1  2  2  2  3  3  3  3  4  ...
The Key:   B1 B2 C1 C2 A2 A3 B2 A1 A2 B1 B2 A4 ...
Stage 4: Radix Sort

• In parallel, run radix sort to order the B array according to key values
• Work load: O(N)
• Relies on thrust library

B-array, after sorting:
The Value: 3  2  3  2  5  7  4  7  4  7  1  3  ...
The Key:   A1 A2 A2 A3 A3 A3 A4 A4 A5 A5 B1 B1 ...
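Stages 3 and 4 together amount to building (bin, body) pairs and sorting them by bin. The sketch below is a serial illustration: Python's stable `sorted` stands in for the parallel radix sort, bins are identified by labels such as "A2" rather than integer indices, and the body-to-bin data covers only the first three bodies shown on the slides. Function names are hypothetical.

```python
def build_pairs(body_bins):
    """Stage 3: one (key=bin, value=body) pair per bin touched by each body."""
    pairs = []
    for body, bins in body_bins.items():
        pairs.extend((bin_id, body) for bin_id in bins)
    return pairs


def sort_by_key(pairs):
    """Stage 4: order the pairs by bin; the sort is stable, so bodies stay in order."""
    return sorted(pairs, key=lambda pair: pair[0])


body_bins = {
    1: ["B1", "B2", "C1", "C2"],
    2: ["A2", "A3", "B2"],
    3: ["A1", "A2", "B1", "B2"],
}
B = sort_by_key(build_pairs(body_bins))
```

After sorting, all bodies touching a given bin sit next to each other in B, which is what the later per-bin stages rely on.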
Stage 5: Find Bin Starting Index

• Host allocates on device an array C of length Nb of pairs of unsigned integers
• Run in parallel, on a per bin basis:
    • Load in parallel in shared memory chunks of the B array and find the location where each bin starts
    • Store it in entry k of C, as the key associated with this pair
    • Key of bins with one or no bodies is set to maximum unsigned int value of 0xffffffff

B-array:
The Value: 3  2  3  2  5  7  4  7  4  7  1  3  ...
The Key:   A1 A2 A2 A3 A3 A3 A4 A4 A5 A5 B1 B1 ...
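Finding where each bin starts in the sorted B array is a single pass over neighboring keys. The serial sketch below is an illustration under assumptions: it records, for each bin that appears in B, the bin label, its starting offset, and a key equal to the number of bodies in the bin, with the key replaced by 0xffffffff when the bin holds fewer than two bodies. Empty bins, which the slide also marks with 0xffffffff, do not appear in B and are therefore not listed here. Names are hypothetical.

```python
UNUSED = 0xFFFFFFFF


def bin_starts(sorted_keys):
    """Return (bin, offset, key) for each distinct bin in the sorted key list."""
    entries = []
    for idx, key in enumerate(sorted_keys):
        if idx == 0 or key != sorted_keys[idx - 1]:
            entries.append([key, idx, 0])   # a new bin starts at this index
        entries[-1][2] += 1                 # count the bodies in the current bin
    return [(b, off, cnt if cnt > 1 else UNUSED) for b, off, cnt in entries]


keys = ["A1", "A2", "A2", "A3", "A3", "A3", "A4", "A4", "A5", "A5", "B1", "B1"]
C = bin_starts(keys)
```

A bin with a single body cannot contain a collision, which is why it is flagged as unused.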


Stage 6: Sort C for Pruning

• Do a parallel radix sort on the array C based on the key
• Purpose: move unused bins to the end of array
• Effort: O(Nb)

C-array, before sorting:
The Value: 0     1  3  6  8  10 ...
The Key:   0xfff 2  3  2  2  2  ...
Bin:       A1    A2 A3 A4 A5 B1 ...

Becomes this:

C-array, after sorting:
The Value: 1  6  8  10 3  ...
The Key:   2  2  2  2  3  ...
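The pruning sort can be mimicked with a stable sort on the key. In this sketch each C entry is an (offset, key) pair, 0xffffffff marks an unused bin, and Python's `sorted` stands in for the parallel radix sort; the names are hypothetical.

```python
UNUSED = 0xFFFFFFFF


def prune_sort(c_array):
    """Stable sort of (offset, key) pairs by key; unused bins end up last."""
    return sorted(c_array, key=lambda pair: pair[1])


c_array = [(0, UNUSED), (1, 2), (3, 3), (6, 2), (8, 2), (10, 2)]
c_sorted = prune_sort(c_array)
n_active = sum(1 for _, key in c_sorted if key != UNUSED)
```

Only the first `n_active` entries need a thread in the per-bin stages that follow.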
Stage 7: Investigate Collisions in each Bin

Carried out in parallel, one thread per bin

B-array:
Index:     0  1  2  3  4  5  6  7  8  9  10 11
The Value: 3  2  3  2  5  7  4  7  4  7  1  3  ...
The Key:   A1 A2 A2 A3 A3 A3 A4 A4 A5 A5 B1 B1 ...

C-array:
The Value: 1  6  8  10 3  ...
The Key:   2  2  2  2  3  ...

• To store information generated during this stage, host allocates unsigned integer array D of length Nb
    • Array D stores the number of actual contacts occurring in each bin
    • D is in sync with (linked to) C, which in turn is in sync with (linked to) B
• Parallelism: one thread per bin
    • Thread k reads the key-value pair in entry k of array C
    • Thread k does a rehearsal of brute-force collision detection
    • Outcome: the number of active collisions taking place in a bin
    • Value is stored in the k-th entry of the D array
Stage 7: Details

Recall that how C is organized is a reflection of how B is organized

B-array (bin vs. body touching this bin):
Index:     0  1  2  3  4  5  6  7  8  9  10 11
The Value: 3  2  3  2  5  7  4  7  4  7  1  3  ...
The Key:   A1 A2 A2 A3 A3 A3 A4 A4 A5 A5 B1 B1 ...

C-array (bin offset in B and number of bodies touching that bin):
The Value: 1  6  8  10 3  ...
The Key:   2  2  2  2  3  ...

• The drill: thread 0 relies on info at C[0], thread 1 relies on info at C[1], etc.
• Let's see what thread 2 (goes with C[2]) does:
    • Read the first 2 bodies that start at offset 6 in B
    • These bodies are 4 and 7, and as B indicates, they touch bin A4
    • Bodies 4 and 7 turn out to have 1 contact in A4, which means that entry 2 of D needs to reflect this

D-array (Length: Nb):
0    1    0    0    0    ...
(A2) (A4) (A5) (B1) (A3)
Stage 7: Details

• Brute Force CD rehearsal
    • Carried out to understand the memory requirements associated with collisions in each bin
    • Finds out the total number of contacts owned by a bin
• Key question: which bin does a contact belong to?
    • Answer: It belongs to the bin containing the CM of the Contact Volume (CMCV)

[Figure: bin grid (rows A to E) with a zoom-in on a contact between two bodies]
Stage 7, Comments

• Two bodies can have multiple contacts, handled ok by the method
• Easy to define the CMCV for two spheres, two ellipsoids, and a couple of other simple geometries
• In general finding CMCV might be tricky
    • Notice picture below, CM of 4 is in A5, CM of 7 is in B4 and CMCV is in A4
    • Finding the CMCV is the subject of the so-called "narrow phase collision detection"
    • It'll be simple in our case since we are going to work with simple geometry primitives

[Figure: the contact between bodies 4 and 7 belongs to bin A4]
Stage 8: Exclusive Prefix Scan

• Save to the side the number of contacts in the last bin (last entry of D), $d_{last}$
    • Last entry of D will get overwritten

D-array (Length: Nb):
0    1    0    0    0    ...
(A2) (A4) (A5) (B1) (A3) ...

• Run parallel exclusive prefix scan on D:

Index:   0    1    2    3    4    ...
D-array: 0    0    1    1    1    ...
Bin:     (A2) (A4) (A5) (B1) (A3) ...

D-array, after exclusive prefix scan

• Total number of actual collisions: $N_c = D[N_b] + d_{last}$
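The bookkeeping of this stage is easy to get wrong by one, so a tiny serial sketch may help. It uses `itertools.accumulate` as a stand-in for the parallel scan, 0-based indexing (so the last entry is `D[-1]`), and the per-bin contact counts from the running example; the function name is hypothetical.

```python
from itertools import accumulate


def exclusive_scan(values):
    """out[i] = values[0] + ... + values[i-1], with out[0] = 0."""
    if not values:
        return []
    return [0] + list(accumulate(values))[:-1]


D = [0, 1, 0, 0, 0]             # contacts per bin: only A4 holds a contact
d_last = D[-1]                  # saved before the scan overwrites the last entry
D_scanned = exclusive_scan(D)   # where each bin starts writing into array E
n_contacts = D_scanned[-1] + d_last
```

Entry k of the scanned array tells thread k where its contacts begin in the output array, and the total gives the size to allocate.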
Stage 9: Populate Array E

• From the host, allocate on the device memory for array E
    • Array E stores the required collision information: normal, two tangents, etc.
    • Number of entries in the array: Nc (see previous slide)
• In parallel, on a per bin basis (one thread/bin):
    • Populate the E array with required info
    • Not discussed in greater detail, this is just like Stage 7, but now you have to generate actual collision info (stage 7 was the rehearsal)
    • Thread for A4 will generate the info for contact "c"
    • Thread for C2 will generate the info for "i" and "d"
    • Etc.

[Figure: bin grid (rows A to E, columns 1 to 5) with bodies and labeled contacts]
Stage 9, details

B, C, D required to populate array E with collision information

B-array (bin vs. body touching this bin):
The Value: 3  2  3  2  5  7  4  7  4  7  1  3  ...
The Key:   A1 A2 A2 A3 A3 A3 A4 A4 A5 A5 B1 B1 ...

C-array (bin offset in B and number of bodies touching that bin):
The Value: 1  6  8  10 3  ...
The Key:   2  2  2  2  3  ...

The second key shows up as 2 since there are two bodies (4 & 7) in the bin with offset 6 (A4)

D-array, after exclusive prefix scan:
Index:   0    1    2    3    4    ...
D-array: 0    0    1    1    1    ...
Bin:     (A2) (A4) (A5) (B1) (A3) ...

• C and B are needed to compute the collision information
• D needed to understand where collision information is stored in E
Multiple-GPU Collision Detection

• Processor: AMD Phenom II X4 940 Black
• Memory: 16 GB DDR2
• Graphics: 4x NVIDIA Tesla C1060
• Power supply 1: 1000 W
• Power supply 2: 750 W
Software/Hardware Setup

[Diagram: main data set and results; OpenMP threads 0 to 3 on a quad-core AMD microprocessor, paired through CUDA with GPUs 0 to 3; Tesla C1060, 4x4 GB memory, 4x30720 threads]
Spheres - Contacts vs. Time
Quad Tesla C1060 Configuration

[Figure: time (axis from 0 to 200) vs. contacts (billions, 0 to 6)]

Speedup - GPU vs. CPU (Bullet library)
[results reported are for spheres]

GPU: NVIDIA Tesla C1060
CPU: AMD Phenom II Black X4 940 (3.0 GHz)

[Figure: speedup (axis from 0 to 200) vs. contacts (millions)]
Domain decomposition & Inter-domain data exchange
Juggling World Record:

64  People  Juggling  (of  all  places)  in  Madison,  Wisconsin 
Computation Using Multiple CPUs
CHRONO:

Domain decomposition & Inter-domain data exchange


•  Divide  simulation  into  chunks  and  have  multiple  CPUs/GPUs  exchange  data  during 
simulation,  as  needed 

•  Elements  leave  one  subdomain  to  move  to  a  different  one  in  transparent  fashion 

•  Key  issues: 

•  Dynamic  load  balancing 

•  Establish  a  dynamic  data  exchange  protocol  (DDEP)  between  sub-domains 
0.5 Million Bodies on 64 Cores

[Penalty Approach, MPI-based]
Computation Using Multiple CPUs
Rover Footprint, Multi-Domain Computation
Pre/Post-processing (visualization)


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CHRONO: 

Visualization  and  Post-Processing 
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•  Rendering  very  complex  scenes  with  more  than  one  million  components 


•  Rendering takes longer than simulating


•  Pursuing  a  rendering  pipeline  that  leverages  parallel  computing 
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Fluid  Dynamics  and  Fluid-Solid  Interaction 
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Rendering  Pipeline:  Problem  Statement 




•  Render  big  data:  efficiently  and  beautifully 


•  Have  the  flexibility  to  render  anything 


•  Make  the  rendering  process  streamlined  and  simple 
•  Provide  rendering  as  a  service 
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Rendering  Pipeline:  From  Data  to  Movie 


•  Example:  Tire-Terrain  Simulation 

•  Data 

•  Sequence  of  Height  Maps 

•  Render  Settings  File 

•  Cluster 

•  Submit  data  and  settings  to  cluster  remotely 

•  Schedule  jobs  and  render 

•  Movie 

•  Returned  upon  completion 


DATA → CLUSTER → MOVIE




An  animation  is  submitted 
to  the  render  farm 


The  master  node  queues  the  frames,  evenly 
distributing  them  between  all  nodes  in  the  cluster 
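The even distribution of frames described above can be sketched as a round-robin assignment. The slide only says "evenly", so round-robin is an assumption here, and `distribute_frames` and the node names are hypothetical.

```python
def distribute_frames(n_frames, nodes):
    """Round-robin assignment: frame i goes to node i mod len(nodes)."""
    if not nodes:
        raise ValueError("at least one render node is required")
    queues = {node: [] for node in nodes}
    for frame in range(n_frames):
        queues[nodes[frame % len(nodes)]].append(frame)
    return queues

# Ten frames over three nodes: queue lengths differ by at most one.
queues = distribute_frames(10, ["node0", "node1", "node2"])
```

Round-robin balances frame counts, not render time; if some frames are much more expensive than others, a work queue that nodes pull from would balance better.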
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Tire Rolling on Deformable Terrain
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Tire Rolling on Deformable Terrain


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Rendering Pipeline Uses Pixar's RenderMan (PRMan)

•  PRMan:  Engineered  to  be  fast,  efficient,  and  configurable  for  complex  scene  rendering 

•  Pixar's  PRMan:  industry's  rendering  standard 

•  Lab  supercomputer  can  run  up  to  320  instances  of  PRMan 


•  Open  source  alternatives: 


•  Aqsis 

•  JrMan 

•  Pixie 
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Chrono  Rendering  Pipeline: 


Chrono::Render
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RenderMan  requires  a  lot  of  work  to  configure  and 
optimize  correctly 

•  Not  aimed  at  science  and  engineering  communities 


Chrono::Render tailored to science and engineering


Chrono::Render - What is it?

•  C++  binaries,  simple  Python  scripting  interface,  and 
succinct  XML  specification 
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Light  Robot  Operating  on  Discrete  Terrain 
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Chrono::Mobility 


CHRONO::Engine

•  Modeling Approaches, Multi-Physics

•  Numerical Methods

•  Proximity Computation & Contact Detection

•  Domain Decomp. & Inter-Domain Communication

•  Pre/Post Processing
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Terramechanics  Modeling  Methodologies 


1.  Empirical  methods 

•  WES  numerics,  NATO  Reference  Mobility  Model  (NRMM) 

2.  Semi-analytical 

•  Bekker-Reece  vertical  pressure/sinkage  relation 

•  Janosi-Hanamoto  slip/shear  relationship 

•  Wong/Reece  plastic  equilibrium  approach 

3.  Physics-based 

•  Finite  Element  Analysis 

•  Particle/Discrete  Element  methods  (DEM,  DVI) 

•  Meshless/Lagrangian  Methods  (SPH,  MPM,  etc.) 
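As an illustration of the semi-analytical class, here is a minimal sketch of two of the relations named above in their commonly quoted forms: the Bekker pressure-sinkage relation p = (k_c/b + k_phi) z^n and the Janosi-Hanamoto shear relation tau = (c + sigma tan(phi)) (1 - exp(-j/K)). The function names and the soil parameter values are placeholders chosen for the example, not measured data, and the Reece variant of the sinkage relation is not shown.

```python
import math

def bekker_pressure(z, b, k_c, k_phi, n):
    """Normal pressure under a plate of width b at sinkage z (SI units)."""
    return (k_c / b + k_phi) * z ** n

def janosi_hanamoto_shear(sigma, j, c, phi, K):
    """Shear stress at shear displacement j under normal stress sigma.

    c is cohesion, phi the internal friction angle in radians, and K the
    shear deformation modulus.
    """
    tau_max = c + sigma * math.tan(phi)  # Mohr-Coulomb shear strength
    return tau_max * (1.0 - math.exp(-j / K))

# Placeholder soil parameters, for illustration only.
soil = dict(k_c=1.0e3, k_phi=1.5e6, n=1.1)
p = bekker_pressure(z=0.05, b=0.3, **soil)
tau = janosi_hanamoto_shear(sigma=p, j=0.01, c=1.0e3,
                            phi=math.radians(30.0), K=0.025)
```

Both relations are closed-form, which is why this class of methods is cheap to evaluate compared with the physics-based approaches in item 3.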


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Terramechanics for Vehicle Mobility, Remarks
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Empirical  methods  have  limited  predictive  attributes  for  general  purpose  vehicle  mobility 

Semi-Analytical  methods  have  been  applied  to  mobility  studies  with  some  success 
See: Trease, Holtz, Azimi, Schmid, Harnisch, Slattengren

Limitations  due  to  some  (but  not  necessarily  all)  of  the  following  assumptions: 

1.  Tire  geometry  is  2-D,  circular  in  shape 

2.  Wheel  moves  forward  at  a  constant  velocity  and  spin  rate 

3.  Wheel  moves  parallel  to  flat  ground 

4.  Soil is a homogeneous, perfectly plastic medium

FEA  or  DEM  are  accurate,  but  computationally  expensive 
Madsen/Heyn/Negrut/Lamb  (demonstrated  shortly) 
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Chrono::Mobility - Goals
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•  Develop  general  purpose  simulation  capability  for  analysis  of  wheeled/tracked 
vehicle  mobility  on  deformable  terrain 


a)  Handle  3-D  tire/track  geometry  to  accurately  estimate  contact  patch  size  and  shape 

b)  Handle  general  3-D  terrain  geometry  to  allow  for  realistic  mobility  scenarios 

c)  Represent  the  terrain  in  a  way  that  considers  soil  stress  state  and  loading  history  in  a 
volumetric  sense,  depending  on  soil  type 

•  Cohesive soils - compaction

•  Dry granular soils - shear failure and flow

•  Brittle soils - fracture, shear failure and flow
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Traction  Element  Geometry  Representation 




•  3-D  geometry  description  for  tire/terrain  collision  query 
•  Discretized  at  a  resolution  to  capture  tread/lug  geometry 

•  APIs  modularized  so  that  terrain  database  accepts  generalized 
traction  geometry  when  queried 


•  Directly  use  Wavefront  (.obj)  solid  models  (top) 
•  Captures  complicated  tread/lug  geometry 


•  Can  also  represent  vehicle  hull  geometry 
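To make the Wavefront (.obj) bullet concrete, here is a minimal sketch of reading vertex and face records from .obj text. It handles only `v` and `f` records, and `parse_obj` and `bounding_box` are illustrative helpers, not part of the Chrono API.

```python
def parse_obj(text):
    """Parse vertices and faces from Wavefront .obj text.

    Only 'v x y z' and 'f ...' records are read. Face entries such as
    '3/1/2' keep the vertex index and drop texture/normal indices.
    Returns (vertices, faces), with faces as tuples of 0-based indices.
    """
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue
        if parts[0] == "v":
            vertices.append(tuple(float(c) for c in parts[1:4]))
        elif parts[0] == "f":
            idx = [int(p.split("/")[0]) for p in parts[1:]]
            # .obj indices are 1-based; negative values count back from the
            # most recently defined vertex.
            faces.append(tuple(i - 1 if i > 0 else len(vertices) + i for i in idx))
    return vertices, faces

def bounding_box(vertices):
    """Axis-aligned bounding box, usable as a coarse first collision filter."""
    xs, ys, zs = zip(*vertices)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

sample = """# a single triangle
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.5
f 1 2 3
"""
vertices, faces = parse_obj(sample)
box = bounding_box(vertices)
```

A triangle list like this is what a tire/terrain collision query would consume; the mesh resolution determines how well tread and lug features are captured.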
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Simulation  Framework 


Vehicle  model 
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Simulation  Framework:  Implementation  Details 


•  Fast,  asynchronous  execution 
•  Not  necessary  for  vehicle  model  to  wait 
for  query  to  complete 


•  Pass STI Force/Moment to vehicle


•  Leverages  both  multi-core  and  GPU 
parallelism 
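The asynchronous pattern in the bullets above can be sketched as follows: the vehicle loop submits a terrain query, keeps stepping with the most recent completed result, and picks up the new force/moment when it is ready. This uses Python threads as a stand-in; `terrain_query`, `simulate`, and the planar terrain are hypothetical and do not reflect the actual VTI interface.

```python
from concurrent.futures import ThreadPoolExecutor

def terrain_query(wheel_xy):
    """Stand-in for the terrain query; returns a (force, moment) pair."""
    x, y = wheel_xy
    height = 0.1 * x + 0.05 * y  # hypothetical planar terrain
    return (1000.0 * height, 10.0 * height)

def simulate(n_steps, dt=0.01, speed=1.0):
    """Advance a one-wheel 'vehicle' without blocking on the terrain query."""
    x = 0.0
    force_moment = (0.0, 0.0)  # most recent completed query result
    pending = None             # query currently in flight, if any
    history = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for _ in range(n_steps):
            # Pick up a finished query, but never wait for an unfinished one.
            if pending is not None and pending.done():
                force_moment = pending.result()
                pending = None
            if pending is None:
                pending = pool.submit(terrain_query, (x, 0.0))
            x += speed * dt  # vehicle step uses the latest available forces
            history.append((x, force_moment))
        if pending is not None:
            force_moment = pending.result()  # drain the last query
    return x, force_moment, history

x_final, last_fm, history = simulate(50)
```

The trade-off is that the vehicle may integrate with slightly stale terrain forces; how much lag is acceptable depends on the step size and how quickly the contact conditions change.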


Flowchart:

VEHICLE
→ VTIM database DLL function (boost::scope_lock)
→ Update tire geometry
→ CPU: Query terrain geometry under wheel (Vti_geomquery)
→ Terrain loaded in already?
    No: OpenFlight DB (raw polygon terrain geometry) → Fill in the SMASH layer data
    Yes: continue
→ Perform collision detection
→ Update tire slip information
→ VEHICLE
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Chrono::Mobility, Today
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•  Multibody  dynamics  vehicle  model  entirely  in  Chrono 

•  Generalized  3-D  tire/track  and  terrain  geometry  representations 

•  Visco-elastic-plastic  soil  model  captures  soil  compaction  due  to  vehicle  loads 


•  Leverages  other  members  of  the  Chrono  family 
•  Based on Standard Tire Interface (STI) & Vehicle-Terrain Interface (VTI)


•  Discrete  terrain  simulation  carried  out  in  the  same  framework 
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Light  Robot  Operating  on  Discrete  Terrain 
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Mobility on Granular Terrain w/ Cohesion
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Mobility on Granular Terrain w/ Cohesion: Close-Up
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What Can You Do With a Validated Predictive Tool?
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Fording,  Future  Plans:  Coupled  Fluid-Structure  Interaction 


9.3  years  of  GPU  time  for  simulating  Fluid-Solid  Interaction  problems 
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Departing  Thoughts 
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Chrono - The Long View
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•  Chrono effort focused on four thrusts:

1.  Validation - useful

2.  Pre/Post - friendly

3.  New features - versatile

4.  Leverage  advanced  computing  -  fast 
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Wisconsin  Applied  Computing  Center 
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•  Our  group,  Computational  Dynamics,  has  12  members 
•  Three  Faculty/Researchers... 
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Nine  Graduate  and  Undergraduate  Students. 
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Closing  Remarks 


•  We  are  focused  on  physics-based  simulation 




•  Vision: 

•  Solve  real-world  problems  (pursue  relevant  questions) 

•  Put  computers  and  good  ideas  to  work 

•  Build  upon  partnerships  and  collaborations 

•  Make  outcomes  of  our  work  available  to  users:  release  early,  release  often  (BSD-3  license) 


•  Approaching  physics-based  simulation  in  a  holistic  fashion  through  Chrono 

•  Modeling  +  numerical  solution  +  visualization 

•  Rely  on  emerging  hardware  for  fast  simulation 
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Thank You.


negrut@wisc.edu 

Simulation  Based  Engineering  Lab 
Wisconsin  Applied  Computing  Center 

PPT  presentation  &  animations  will  be  available  on-line  for  download. 
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